Modeling Future Cost for Neural Machine Translation
Authors
Abstract
Existing neural machine translation (NMT) systems utilize sequence-to-sequence networks to generate the target sentence word by word, and then make the word generated at each time-step as consistent as possible with its counterpart in the reference. However, the trained model tends to focus on ensuring the accuracy of the currently generated word and does not consider its future cost, which means the expected cost of generating the subsequent target sequence (i.e., the next word). To respond to this issue, in this article we propose a simple and effective method to model the future cost for NMT systems. In detail, a future cost representation is learned based on contextual information to compute an additional loss that guides the training of the model. Furthermore, the learned future cost representation is used to help the generation of the next word at decoding time. Experimental results on three widely-used translation datasets, including WMT14 English-to-German, WMT14 English-to-French, and WMT17 Chinese-to-English, show that the proposed approach achieves significant improvements over a strong Transformer-based baseline.
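The abstract's core idea, an auxiliary loss computed from a learned future-cost representation and added to the usual cross-entropy objective, can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the paper's implementation: the module name `FutureCostLoss`, the projection layer, and all shapes are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FutureCostLoss(nn.Module):
    """Hypothetical sketch: learn a future-cost representation from the
    decoder state at step t and use it to score the next target word (t+1).
    The resulting loss would be added, with some weight, to the standard
    NMT cross-entropy loss during training."""

    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        # Projection producing the future-cost representation (illustrative).
        self.future_proj = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, decoder_states: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # decoder_states: (batch, T, d_model); targets: (batch, T)
        # Build a future-cost representation from the state at step t...
        future = torch.tanh(self.future_proj(decoder_states[:, :-1]))
        # ...and score the word at step t+1 with it.
        logits = self.out(future)  # (batch, T-1, vocab_size)
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets[:, 1:].reshape(-1),
        )
```

In training, this term would be combined with the main objective, e.g. `loss = ce_loss + lambda_fc * future_cost_loss(states, targets)`, where `lambda_fc` is a tunable weight (again an assumption; the paper's exact formulation may differ).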
Similar resources
Modeling Past and Future for Neural Machine Translation
Existing neural machine translation systems do not explicitly model what has been translated and what has not during the decoding phase. To address this problem, we propose a novel mechanism that separates the source information into two parts: translated PAST contents and untranslated FUTURE contents, which are modeled by two additional recurrent layers. The PAST and FUTURE contents are fed to...
Modeling Source Syntax for Neural Machine Translation
Even though a linguistics-free sequence to sequence model in neural machine translation (NMT) has certain capability of implicitly learning syntactic information of source sentences, this paper shows that source syntax can be explicitly incorporated into NMT effectively to provide further improvements. Specifically, we linearize parse trees of source sentences to obtain structural label sequenc...
Neural Machine Translation with Recurrent Attention Modeling
Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future. We improve upon the attention model of Bahdanau et al. (2014) by explicitly modeling the relationship between previous and subsequent attention levels for each word using one recurrent network per input word....
Cost Weighting for Neural Machine Translation Domain Adaptation
In this paper, we propose a new domain adaptation technique for neural machine translation called cost weighting, which is appropriate for adaptation scenarios in which a small in-domain data set and a large general-domain data set are available. Cost weighting incorporates a domain classifier into the neural machine translation training algorithm, using features derived from the encoder repres...
Cost-Aware Learning Rate for Neural Machine Translation
Neural Machine Translation (NMT) has drawn much attention due to its promising translation performance in recent years. The conventional optimization algorithm for NMT sets a unified learning rate for each gold target word during training. However, words under different probability distributions should be handled differently. Thus, we propose a cost-aware learning rate method, which can produce...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2021
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2020.3042006